In the robotics and computer vision communities, surveillance tasks such as human detection, tracking, and motion recognition with a camera have been studied extensively. Deep learning algorithms are widely used for these tasks, as in other computer vision tasks. However, existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations such as harsh weather and low illuminance. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person view data annotated by well-trained annotators. Each pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and describe methods of exploiting it with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
translated by 谷歌翻译
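Visual-inertial odometry and SLAM algorithms are widely used in various fields, such as service robots, drones, and autonomous vehicles. Most SLAM algorithms assume that landmarks are static. In the real world, however, various dynamic objects exist, and they degrade pose estimation accuracy. In addition, temporarily static objects, which are static while being observed but move once they are out of sight, trigger false loop closures. To overcome these problems, we propose a novel visual-inertial SLAM framework, called DynaVINS, which is robust against both dynamic objects and temporarily static objects. In our framework, we first present a robust bundle adjustment that can reject features from dynamic objects by leveraging pose priors estimated by IMU preintegration. Then, a keyframe grouping method and a multi-hypothesis-based constraint grouping method are proposed to reduce the effect of temporarily static objects in loop closing. Subsequently, we evaluate our method on a public dataset that contains numerous dynamic objects. Finally, the results confirm that DynaVINS achieves promising performance compared with other state-of-the-art methods by successfully rejecting the effect of dynamic and temporarily static objects. Our code is available at https://github.com/url-kaist/dynavins.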
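In the field of 3D perception using 3D LiDAR sensors, ground segmentation is an essential task for various purposes, such as traversable area detection and object recognition. In this context, several ground segmentation methods have been proposed. However, some limitations remain. First, some ground segmentation methods require fine-tuning of parameters depending on the surroundings, which is excessively laborious and time-consuming. Moreover, even if the parameters are well adjusted, a partial under-segmentation problem can still emerge, which means that ground segmentation fails in some regions. Finally, ground segmentation methods typically fail to estimate an appropriate ground plane when the ground is above another structure, such as a retaining wall. To address these problems, we propose a robust ground segmentation method called Patchwork++, an extension of Patchwork. Patchwork++ exploits adaptive ground likelihood estimation (A-GLE) to adaptively calculate appropriate parameters based on previous ground segmentation results. Moreover, temporal ground revert (TGR) alleviates the partial under-segmentation problem by using the temporary ground property. In addition, region-wise vertical plane fitting (R-VPF) is introduced to segment the ground plane properly even if the ground is elevated with different layers. Finally, we present reflected noise removal (RNR) to efficiently eliminate virtual noise points based on the 3D LiDAR reflection model. We demonstrate qualitative and quantitative evaluations using the SemanticKITTI dataset. Our code is available at https://github.com/url-kaist/patchwork-plusplus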
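In contrast to standard cameras, event cameras interpret the world in an entirely different manner: as a collection of asynchronous events. Despite the unique data output of event cameras, many event feature detection and tracking algorithms have shown significant progress by taking a detour through frame-based data representations. This paper questions the need to do so and proposes a novel event-data-friendly method that achieves simultaneous feature detection and tracking, called event Clustering-based Detection and Tracking (ECDT). Our method employs a novel clustering method, named k-NN Classifier-based Spatial Clustering and Applications with Noise (KCSCAN), to cluster adjacent polarity events and retrieve event trajectories. With the aid of a head and tail descriptor matching process, event clusters that reappear in a different polarity are continually tracked, elongating the feature tracks. Thanks to our clustering approach in spatio-temporal space, our method automatically solves feature detection and feature tracking simultaneously. Also, ECDT can extract feature tracks at any frequency with an adjustable time window, which does not corrupt the high temporal resolution of the original event data. Compared with the state-of-the-art approach, our method achieves a 30% better feature tracking age while maintaining a comparably low error.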
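Perception of traversable regions and objects of interest from a 3D point cloud is one of the critical tasks in autonomous navigation. A ground vehicle needs to look for traversable terrain that can be explored by its wheels. Then, to make safe navigation decisions, the segmentation of objects positioned on that terrain has to follow. However, over-segmentation and under-segmentation can negatively influence such navigation decisions. To that end, we propose TRAVEL, which performs traversable ground detection and object clustering using a graph representation of a 3D point cloud. To segment the traversable ground, a point cloud is encoded into a graph structure, the tri-grid field, which treats each tri-grid as a node. Then, the traversable regions are searched and redefined by examining the local convexity and concavity of the edges that connect nodes. Our above-ground object segmentation, on the other hand, employs a graph structure in which a group of horizontally neighboring 3D points in a spherical-projection space is represented as a node and the vertical/horizontal relationships between nodes as edges. By fully leveraging this node-edge structure, the above-ground segmentation ensures real-time operation and mitigates over-segmentation. Through experiments using simulations, urban scenes, and our own datasets, we demonstrate that our proposed traversable ground segmentation algorithm outperforms other state-of-the-art methods in terms of conventional metrics, and that our newly proposed evaluation metrics are meaningful for assessing above-ground segmentation. We will make the code and our own dataset publicly available at https://github.com/url-kaist/travel.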
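We propose a Momentum Re-identification (MoReID) framework that can leverage a very large number of negative samples in training for general re-identification tasks. The design of this framework is inspired by Momentum Contrast (MoCo), which uses a dictionary to store current and past batches to build a large set of encoded samples. As we find it less effective to use past positive samples, whose encoded feature properties are highly inconsistent with those of the current positive samples, MoReID is designed to use only the large number of negative samples stored in the dictionary. However, if we train the model using the widely used triplet loss, which uses only one sample to represent a set of positive/negative samples, it is hard to effectively leverage the enlarged set of negative samples acquired by the MoReID framework. To maximize the advantage of using the scaled-up negative sample set, we introduce Hard-distance Elastic loss (HE loss), which is capable of using more than one hard sample to represent a large number of samples. Our experiments demonstrate that the large number of negative samples provided by the MoReID framework can be utilized at full capacity only with the HE loss, achieving state-of-the-art accuracy on three re-ID benchmarks: VeRi-776, Market-1501, and VeRi-Wild.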
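As a rough illustration of the dictionary idea described in this abstract, the following sketch keeps encoded samples from current and past batches in a fixed-size FIFO queue and returns only those with a different identity as negatives. The class name `NegativeQueue`, its methods, and the tuple-based feature type are hypothetical; this is not the MoReID implementation, and it omits the momentum encoder and the HE loss entirely.

```python
from collections import deque
from typing import Deque, List, Tuple

Feature = Tuple[float, ...]


class NegativeQueue:
    """Fixed-size FIFO store of (identity, feature) pairs (hypothetical helper)."""

    def __init__(self, max_size: int) -> None:
        # deque(maxlen=...) silently drops the oldest entry once the queue is full.
        self._items: Deque[Tuple[int, Feature]] = deque(maxlen=max_size)

    def enqueue(self, identity: int, feature: Feature) -> None:
        """Store one encoded sample together with its identity label."""
        self._items.append((identity, feature))

    def negatives_for(self, identity: int) -> List[Feature]:
        """Return stored features whose identity differs from the query's."""
        return [feat for ident, feat in self._items if ident != identity]


queue = NegativeQueue(max_size=3)
queue.enqueue(1, (0.1, 0.9))
queue.enqueue(2, (0.8, 0.2))
queue.enqueue(1, (0.2, 0.8))
queue.enqueue(3, (0.5, 0.5))  # evicts the oldest entry (identity 1)

negatives = queue.negatives_for(1)
```

Stored samples that share the query's identity are simply skipped, which mirrors the abstract's choice to use only negatives from the dictionary.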
The coronavirus disease 2019 (COVID-19) was first identified in Wuhan, China, in early December 2019 and has since become a pandemic. When COVID-19 patients undergo radiography examination, radiologists can observe the presence of radiographic abnormalities in their chest X-ray (CXR) images. In this study, a deep convolutional neural network (CNN) model is proposed to aid radiologists in diagnosing COVID-19 patients. First, this work conducted a comparative study on the performance of modified VGG-16, ResNet-50 and DenseNet-121 in classifying CXR images into normal, COVID-19 and viral pneumonia. Then, the impact of image augmentation on the classification results was evaluated. The publicly available COVID-19 Radiography Database was used throughout this study. In the comparison, ResNet-50 achieved the highest accuracy, 95.88%. Next, after training ResNet-50 on a dataset augmented with rotation, translation, horizontal flip, intensity shift and zoom, the accuracy dropped to 80.95%. Furthermore, an ablation study on the effect of image augmentation on the classification results found that the combination of rotation and intensity shift augmentation obtained an accuracy of 96.14%, higher than the baseline. Finally, ResNet-50 with rotation and intensity shift augmentations performed the best and was proposed as the final classification model in this work. These findings demonstrate that the proposed classification model can provide promising results for COVID-19 diagnosis.
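The two augmentations that the ablation study singles out, rotation and intensity shift, can be illustrated with a small dependency-free sketch. The function names, the nearest-neighbour resampling, the zero fill value, and the [0, 1] pixel range are assumptions made for illustration; the abstract does not describe the study's actual augmentation pipeline or parameter ranges.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def rotate_nearest(image: Image, degrees: float, fill: float = 0.0) -> Image:
    """Rotate about the image centre using nearest-neighbour inverse mapping."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # For each output pixel, look up the source pixel that maps onto it.
            dx, dy = x - cx, y - cy
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def shift_intensity(image: Image, delta: float,
                    lo: float = 0.0, hi: float = 1.0) -> Image:
    """Add a constant offset to every pixel and clip to [lo, hi]."""
    return [[min(hi, max(lo, pixel + delta)) for pixel in row] for row in image]


grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
rotated = rotate_nearest(grid, 90.0)
shifted = shift_intensity([[0.2, 0.9]], 0.2)
```

In a training loop, the angle and offset would typically be drawn at random per image; the ranges to draw from are a tuning choice that the abstract does not specify.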
Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning-based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmarks, we show the effectiveness of our proposed approach through an experimental study.
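One plausible way to read the intermediary reward and the multi-objective formulation is sketched below. The abstract does not give the exact reward, so the linear form, the `cost_weight` parameter, and the Pareto-dominance comparison are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import Tuple


def step_reward(score_before: float, score_after: float,
                cost: float, cost_weight: float = 1.0) -> float:
    """Scalar reward for one acquisition step: model improvement minus weighted cost."""
    return (score_after - score_before) - cost_weight * cost


def vector_reward(score_before: float, score_after: float,
                  cost: float) -> Tuple[float, float]:
    """Two-objective reward: (model improvement, negated cost), both to be maximized."""
    return (score_after - score_before, -cost)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


scalar = step_reward(0.70, 0.78, cost=0.05)
cheap = vector_reward(0.70, 0.75, cost=0.05)
pricey = vector_reward(0.70, 0.78, cost=0.20)
better = vector_reward(0.70, 0.78, cost=0.05)
```

The scalar version collapses the trade-off into a single number through `cost_weight`, whereas the vector version keeps both objectives separate so that a multi-objective search can compare candidate acquisitions by dominance instead of a fixed weighting.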
Uniform-precision neural network quantization has gained popularity because it simplifies densely packed arithmetic units for high computing capability. However, it ignores the heterogeneous sensitivity of layers to quantization errors, resulting in sub-optimal inference accuracy. This work proposes a novel neural architecture search called neural channel expansion that adjusts the network structure to alleviate accuracy degradation from ultra-low uniform-precision quantization. The proposed method selectively expands channels for the quantization-sensitive layers while satisfying hardware constraints (e.g., FLOPs, PARAMs). Based on in-depth analysis and experiments, we demonstrate that the proposed method can adapt the channels of several popular networks to achieve superior 2-bit quantization accuracy on CIFAR10 and ImageNet. In particular, we achieve the best-to-date Top-1/Top-5 accuracy for 2-bit ResNet50 with fewer FLOPs and a smaller parameter size.
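To make the idea of uniform-precision quantization and layer-dependent sensitivity concrete, here is a minimal sketch of a min/max uniform quantizer applied with the same bit width to two made-up "layers". The quantizer form, the function names, and the example weight values are assumptions for illustration; the abstract does not specify the quantization scheme used, and the sketch does not implement neural channel expansion.

```python
from typing import List


def uniform_quantize(values: List[float], num_bits: int = 2) -> List[float]:
    """Snap each value to one of 2**num_bits evenly spaced levels between min and max."""
    levels = 2 ** num_bits  # 2 bits give 4 representable levels
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant tensor needs no quantization
    step = (hi - lo) / (levels - 1)
    return [lo + round((v - lo) / step) * step for v in values]


def quantization_mse(values: List[float], num_bits: int = 2) -> float:
    """Mean squared error introduced by quantizing `values`."""
    quantized = uniform_quantize(values, num_bits)
    return sum((v - q) ** 2 for v, q in zip(values, quantized)) / len(values)


weights = [-1.0, -0.4, 0.1, 0.5, 1.0]
quantized = uniform_quantize(weights, num_bits=2)

# Hypothetical layers: same bit width, very different value ranges.
narrow_layer = [-0.1, -0.05, 0.0, 0.05, 0.1]
wide_layer = [-4.0, -0.3, 0.0, 0.2, 4.0]
narrow_error = quantization_mse(narrow_layer)
wide_error = quantization_mse(wide_layer)
```

With identical precision, the wide-range layer incurs a much larger error than the narrow one, which is the kind of per-layer imbalance that a uniform bit width cannot address on its own.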
This study introduces and examines the potential of an AI system to generate health awareness messages. The topic of folic acid, a vitamin that is critical during pregnancy, served as a test case. Using prompt engineering, we generated messages that could be used to raise awareness and compared them to retweeted human-generated messages via computational and human evaluation methods. The system was easy to use and prolific, and computational analyses revealed that the AI-generated messages were on par with human-generated ones in terms of sentiment, reading ease, and semantic content. Also, the human evaluation study showed that AI-generated messages ranked higher in message quality and clarity. We discuss the theoretical, practical, and ethical implications of these results.